
feat: Add migrations-functional-testing agent skill for Dataflow pipeline testing #3883

Merged: aasthabharill merged 8 commits into main from add-functional-testing-skill on Jun 5, 2026

Conversation

@aasthabharill
Member

@aasthabharill aasthabharill commented Jun 4, 2026

b/519408868

This PR introduces the migrations-functional-testing agent skill under .agents/skills/. This skill equips the AI agent with a modular, gated workflow to perform end-to-end functional testing of Dataflow templates against local code changes using GCP resources.

IMPORTANT: The skill states explicitly that it is to be used ONLY for migration-specific pipelines, i.e. sourcedb-to-spanner, spanner-to-sourcedb, datastream-to-spanner, and gcs-spanner-dv, because it was written with only these in mind.

Key Features

  1. Topology & Schema Planning: Analyzes repository code changes, maps source/sink configurations, and plans test cases.
  2. Autonomous Provisioning: Autonomously provisions ephemeral Spanner and Cloud SQL instances once the proposed plans are approved.
  3. Database Credential Management: Safely requests database user credentials or offers to autonomously generate a temporary DB user and password for Cloud SQL.
  4. Optimized Staging: Stages jobs using standard public Dataflow template paths, or packages local modifications using optimized Maven flags (skipping Spotless, Checkstyle, and unit tests to minimize latency).
  5. Custom & Sharding Transformations: Detects if a custom/sharding JAR is required, plans the transformation logic, builds the JAR, uploads it to GCS, and wires the paths into the Terraform configurations.
  6. Strict Safety & Approval Gates: Enforces "Stop and Wait" gates for code analysis, schema setups, configuration files, and Terraform variables. Once approved, the agent executes tasks (like running Terraform apply) autonomously.
  7. Verification & Success Criteria: Proposes destination table verification queries, explicit DLQ checks (dlq/ and filteredEvents/ GCS folders), and generates a final markdown verification report.
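
As a rough illustration of the "Optimized Staging" item above, the sketch below assembles a Maven command line that skips unit tests, Checkstyle, and the Spotless check. The exact flags the skill uses are not shown in this PR description; `-DskipTests`, `-Dcheckstyle.skip=true`, and `-Dspotless.check.skip=true` are the standard skip properties of the Surefire, Checkstyle, and Spotless Maven plugins. The helper name `build_package_command` and the module path are hypothetical.

```python
from typing import List


def build_package_command(module: str, fast: bool = True) -> List[str]:
    """Return a Maven command that packages one module plus the modules it depends on.

    `module` is a hypothetical module path; the skill's real invocation may differ.
    """
    cmd = ["mvn", "package", "-pl", module, "-am"]
    if fast:
        # Skip the slow, non-essential phases for a quick local iteration.
        cmd += [
            "-DskipTests",                 # Surefire: do not run unit tests
            "-Dcheckstyle.skip=true",      # Checkstyle plugin: skip style checks
            "-Dspotless.check.skip=true",  # Spotless plugin: skip the format check
        ]
    return cmd


if __name__ == "__main__":
    # Hypothetical module path, used only for illustration.
    print(" ".join(build_package_command("v2/sourcedb-to-spanner")))
```

Returning the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting concerns.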

Files Added/Modified

  • [NEW] SKILL.md: The core instruction set defining the orchestrator workflow, safety gates, and automation rules.
  • [NEW] TEST.md: A step-by-step manual test case designed for reviewers to verify the skill using a custom transformation scenario.
  • [NEW] skills_index.md: Added the index references to map to the new skill directory and name.

Verification Run

Two tests were run:

  1. Testing a custom transformation in sourcedb-to-spanner without any new changes
  2. Testing a new feature to support custom transformations in gcs-spanner-dv (GitHub PR)

Analysis:

  • Because many of these steps are gated on user approval, the workflow is somewhat hands-on, but that is by design.
  • It improves the functional testing experience considerably: the agent does most of the work autonomously and needs user attention mainly at the beginning (while curating test cases and during setup).
  • The test cases do require attention and customization from the user; this is very important.
  • Setup creation follows the pre-decided test cases well and needs minimal prompting from the user, and the agent is able to clearly define the success criteria, etc.
  • Sometimes the agent asks the user to perform a step (for example, creating the custom transformation JAR), but it can do the step itself once the user prompts it to. Some improvements were made to smooth this experience.
  • The agent successfully used the debugging skill to track job progress, debug issues, make changes, and re-run the job. The follow-up jobs succeeded.
  • The agent tracks job progress, autonomously queries the destination, and prepares a test report that clearly states successes and failures after test execution.

Follow up work

The test cases do require some attention and customization from the user. A follow-up effort will add a skill that improves the creation of edge cases; it will be referenced here once it's completed.

@aasthabharill aasthabharill changed the title from "initial changes" to "[Migrations] Add skill to functionally test PRs" Jun 4, 2026
@aasthabharill aasthabharill changed the title from "[Migrations] Add skill to functionally test PRs" to "feat: Add migrations-functional-testing agent skill for Dataflow pipeline testing" Jun 4, 2026
@aasthabharill aasthabharill added the addition New feature or request label Jun 4, 2026
@aasthabharill aasthabharill marked this pull request as ready for review June 4, 2026 09:08
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds a new functional testing skill for the AI agent, designed to streamline the validation of Dataflow migration templates. By automating the provisioning of ephemeral GCP resources and integrating verification steps, the skill reduces manual overhead while maintaining strict safety and approval requirements for infrastructure changes.

Highlights

  • New Agent Skill: Introduced the migrations-functional-testing skill to enable autonomous, gated end-to-end functional testing for specific Dataflow migration templates.
  • Workflow Automation: The skill automates environment provisioning, configuration staging, and verification reporting while enforcing strict safety gates for user approval.
  • Template Support: Supports functional testing for sourcedb-to-spanner, spanner-to-sourcedb, datastream-to-spanner, and gcs-spanner-dv templates.

@aasthabharill aasthabharill requested a review from manitgupta June 4, 2026 09:08
@aasthabharill aasthabharill marked this pull request as draft June 4, 2026 09:09
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces the migrations-functional-testing skill, which provides a modular and gated workflow for functionally testing local Dataflow pipeline changes. It includes the skill definition, test cases, and an updated skills index. The review feedback highlights two main issues: a typo in the directory path for the smt-e2e-dataflow-debugging skill in the index file, and a non-portable shell command used for generating unique run IDs in the skill definition which can fail on macOS/BSD platforms.
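
The flagged shell command is not shown in this conversation, so the following is only a sketch of one platform-independent way to produce a unique run ID, using the Python standard library instead of GNU-specific shell utilities. The function name `make_run_id`, the `ft` prefix, and the ID layout are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def make_run_id(prefix: str = "ft") -> str:
    """Build a run ID such as 'ft-20260605141700-a1b2c3'.

    Uses only lowercase letters, digits, and hyphens, which keeps the ID
    usable as a suffix in resource names that restrict their character set.
    """
    # UTC timestamp with second resolution makes IDs sortable by creation time.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    # Six random hex characters distinguish runs started within the same second.
    suffix = uuid.uuid4().hex[:6]
    return f"{prefix}-{stamp}-{suffix}"


if __name__ == "__main__":
    print(make_run_id())
```

Because it relies on `uuid` and `datetime` rather than on `date` or `/dev/urandom` pipelines, the same code behaves identically on Linux and macOS.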

Comment thread .agents/skills_index.md
Comment thread .agents/skills/smt_functional_testing/SKILL.md Outdated
@aasthabharill aasthabharill marked this pull request as ready for review June 5, 2026 08:41
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a new agent skill, smt-functional-testing, which functionally tests local Dataflow pipeline changes against the main branch using ephemeral GCP resources and gated approvals. The changes include the skill definition, test cases, and registration in the global skills index. The review feedback suggests improving the portability of the run ID generation command in SKILL.md to avoid errors on macOS, and referencing defined environment variables directly in the teardown script instead of using generic placeholders to enable full automation.
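
To illustrate the teardown suggestion, here is a minimal sketch that reads resource names from environment variables and builds the delete commands from them, so nothing has to be filled in by hand. The teardown script itself is not shown in this conversation; the variable names `PROJECT_ID`, `SPANNER_INSTANCE_ID`, and `CLOUDSQL_INSTANCE_ID` and the helper `teardown_commands` are hypothetical. The `gcloud spanner instances delete` and `gcloud sql instances delete` commands and the `--quiet` and `--project` flags are standard gcloud CLI features.

```python
import os
from typing import List, Mapping, Optional


def teardown_commands(env: Optional[Mapping[str, str]] = None) -> List[List[str]]:
    """Build gcloud delete commands from environment variables.

    Raises KeyError if a variable is missing, so teardown fails loudly
    instead of running against a placeholder value.
    """
    env = os.environ if env is None else env
    project = env["PROJECT_ID"]             # hypothetical variable name
    spanner = env["SPANNER_INSTANCE_ID"]    # hypothetical variable name
    cloudsql = env["CLOUDSQL_INSTANCE_ID"]  # hypothetical variable name
    return [
        ["gcloud", "spanner", "instances", "delete", spanner,
         f"--project={project}", "--quiet"],
        ["gcloud", "sql", "instances", "delete", cloudsql,
         f"--project={project}", "--quiet"],
    ]


if __name__ == "__main__":
    demo_env = {
        "PROJECT_ID": "my-project",
        "SPANNER_INSTANCE_ID": "ft-spanner",
        "CLOUDSQL_INSTANCE_ID": "ft-sql",
    }
    # Only prints the commands; pass each list to subprocess.run to execute.
    for cmd in teardown_commands(demo_env):
        print(" ".join(cmd))
```

The function only constructs the commands, which makes it possible to show them to the user at an approval gate before anything is deleted.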

Comment thread .agents/skills/smt_functional_testing/SKILL.md Outdated
Comment thread .agents/skills/smt_functional_testing/SKILL.md Outdated
aasthabharill and others added 2 commits June 5, 2026 14:17
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@aasthabharill aasthabharill merged commit 1c06683 into main Jun 5, 2026
12 checks passed
@aasthabharill aasthabharill deleted the add-functional-testing-skill branch June 5, 2026 11:51

Labels

addition New feature or request size/L


2 participants